Separable Architecture for Fault Isolation and Recovery

Authors

  • Matthew C. Ruschmann
  • John McGreevy
Abstract

Fault management is one of the key technologies that enable distributed and disaggregated mission architectures, in which multiple vehicles work cooperatively and autonomously in a cluster or formation, a typical mission concept for small satellites. In this paper, we describe a software architecture, called Separable Architecture for Fault Isolation and Recovery (SAFIR), which addresses fault management for these types of missions. Although SAFIR is applicable to any system of systems, this paper demonstrates SAFIR for a cluster of spacecraft. The resulting fault detection, isolation, and recovery capability benefits from the SAFIR architecture, which is robust to intermittent communication and highly modular. The SAFIR software has been developed as apps for the Core Flight System (cFS) and has been demonstrated successfully on representative hardware using a high-fidelity simulation of spacecraft in low Earth orbit.

Related resources

Fault Isolation and Quick Recovery in Isolation File Systems

Lanyue Lu presented isolation file systems, which provide fault isolation and quick recovery within a single file system. Because file systems are important data access interfaces in many environments, high availability is critical; however, a single fault can have a large-scale impact on the whole file system, such as a read-only remount or a system crash. Lanyue explained how a metadata ...

ObjectAgent for Robust Autonomous Control

The ObjectAgent system is being developed to create a robust software architecture for autonomous control of complex systems. Agents are used to implement all of the software functionality and communicate through simplified natural language messages. These agents have a set of basic survival skills that monitor for internal software faults, providing low-level fault detection and recovery. High...

Reliability Analysis for Train Control System by Hardware Redundancy Architecture in Fault Tolerance System

A train control system is vital because it controls train speed and interlocking on a railway. As a vital system, it is designed as a double-module or triple-module system. Hardware redundancy means using additional hardware to detect and tolerate faults. There are three forms of hardware redundancy: passive, active, and hybrid. Passive redundancy architecture achi...

Performance Evaluation of Recursive Network Architecture for Fault-tolerance

Network fault tolerance is one of the most important capabilities required by mission-critical systems such as the naval Combat System Data Network (CSDN). In this paper, we present performance evaluation results of a fault-tolerant network scheme called Recursive Scalable Autonomous Fault-tolerant Ethernet (RSAFE). The primary goal of the RSAFE scheme is to provide network scalability, and autonomo...

Protected Ethernet Rings for Optical Access Networks

In this paper we propose a centralized link layer architecture for providing low latency fault recovery for optical access rings. This architecture exploits the naturally uneven breakdown of network management responsibilities between the components of an access ring. Important administrative operations like ring status checking, fault detection and recovery are aggregated at the HUB component ...

Publication date: 2017